Urdu to Punjabi Machine Translation: An Incremental Training Approach

Authors

  • Umrinderpal Singh
  • Vishal Goyal
  • Gurpreet Singh Lehal
Abstract

Statistical machine translation is a widely used approach in automatic translation research and promises good accuracy. This work develops an Urdu to Punjabi statistical machine translation system. The system trains its statistical model incrementally. Instead of a parallel sentence corpus, manually mapped phrases were used to train the model. In the preprocessing phase, various rules handle tokenization and segmentation. Alongside these rules, a text classification system assigns the input text to one of a set of predefined classes, and the decoder translates the text according to the domain selected by the classifier. The system uses a Hidden Markov Model (HMM) for learning and the Viterbi algorithm for decoding. Experiments and evaluation show that a simple statistical model such as an HMM yields good accuracy for a closely related language pair like Urdu-Punjabi. The system achieved a BLEU score of 0.86 and more than 85% accuracy in manual testing.

Keywords: Machine Translation; Urdu to Punjabi Machine Translation; NLP; Urdu; Punjabi; Indo-Aryan Languages
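
The abstract names an HMM for learning and the Viterbi algorithm for decoding but gives no model details. The following is a minimal sketch of Viterbi decoding under assumptions that are not taken from the paper: hidden states stand for target-side (Punjabi) phrases, observations stand for source-side (Urdu) phrases, and all probabilities are invented toy values. The tokens `u_x`, `u_y`, `p_a`, `p_b`, `p_c`, the `FLOOR` smoothing constant, and the function names are hypothetical placeholders.

```python
import math

FLOOR = 1e-12  # assumed smoothing floor for unseen events (not from the paper)


def logp(p):
    """Log-probability with a floor so unseen events do not produce log(0)."""
    return math.log(max(p, FLOOR))


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] = (log-prob of the best path ending in state s at step t, backpointer)
    best = [{
        s: (logp(start_p.get(s, 0)) + logp(emit_p[s].get(observations[0], 0)), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev, score = max(
                ((r, best[-1][r][0] + logp(trans_p[r].get(s, 0))) for r in states),
                key=lambda pair: pair[1],
            )
            column[s] = (score + logp(emit_p[s].get(obs, 0)), prev)
        best.append(column)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return path[::-1]


# Toy model: every number and token below is illustrative only.
states = ["p_a", "p_b", "p_c"]                      # hypothetical target phrases
start_p = {"p_a": 0.6, "p_b": 0.3, "p_c": 0.1}      # P(first target phrase)
trans_p = {                                         # P(next target | previous target)
    "p_a": {"p_b": 0.2, "p_c": 0.8},
    "p_b": {"p_b": 0.5, "p_c": 0.5},
    "p_c": {"p_a": 1.0},
}
emit_p = {                                          # P(source phrase | target phrase)
    "p_a": {"u_x": 0.9},
    "p_b": {"u_x": 0.1, "u_y": 0.5},
    "p_c": {"u_y": 0.5},
}

decoded = viterbi(["u_x", "u_y"], states, start_p, trans_p, emit_p)
print(decoded)
```

In this sketch the transition table plays the role of a target-side phrase bigram model and the emission table plays the role of a phrase translation table, which is one common way to cast translation as HMM decoding; the paper may structure its model differently.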

Similar resources

AGHAZ: An Expert System Based approach for the Translation of English to Urdu

Machine Translation (MT) of English text to its Urdu equivalent is a difficult challenge. Many attempts have been made, but only a few limited solutions have been provided so far. We present a direct approach, using an expert system to translate English text into its equivalent Urdu, using The Unicode Standard, Version 4.0 (ISBN 0-321-18578-1) Range: 0600–06FF. The expert system works with a knowledg...

Qualitative Analysis of Contemporary Urdu Machine Translation Systems

The diversity in source and target languages coupled with source language ambiguity makes Machine Translation (MT) an exceptionally hard problem. The highly information intensive corpus based MT leads the MT research field today, with Example Based MT and Statistical MT representing two dissimilar frameworks in the data-driven paradigm. Example Based MT is another approach that involves matchin...

Urdu to English Machine Translation using Bilingual Evaluation Understudy

Machine Translation (MT) is exigent because it involves several thorny subtasks such as intrinsic language ambiguities, linguistic complexities and diversities between source and target language. Usually MT depends upon rules that provide linguistic information. At present, the corpus based MT approaches are used that include techniques like Example Based MT (EBMT) and Statistical MT (SMT). In ...

Statistical Approach to Transliteration from English to Punjabi

Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Transliteration is a crucial factor in CLIR and MT. It is important for Machine Translation, especially when the languages do not use the same scripts. This paper addresses the issue of statistical mach...

Rule Based Approach for Machine Translation System for Related Languages: Punjabi to Hindi

Machine Translation is one of the important areas in natural language processing. Machine Translation is a great challenge for closely related language pairs. A Machine Translation system for more or less related languages relies on similarities such as syntax and vocabulary. Punjabi and Hindi both originate from the same parent language, so both are closely related and having lot o...

Publication date: 2016